Jaegul Choo, Georgia Institute of Technology, joyfull@cc.gatech.edu [PRIMARY contact]
Hanseung Lee, Georgia Institute of Technology, joyfull@cc.gatech.edu
Jaeeun Shim, Georgia Institute of Technology, jaeeun.shim@gatech.edu
Emily Fujimoto, Harvey Mudd College, efujimoto@hmc.edu
SocioJigsaw was
implemented by Dr. Carsten Görg and based on Jigsaw's ListView, a visualization
tool where named entities—such as locations and people's names—are extracted
from documents and the relations between them are visualized in various ways.
SocioJigsaw has two views: one is the modified ListView mentioned above, and
the other is GeoView, which displays the links between the cities of a given
map based on which entities are selected in the ListView. ListView provides the
user with as many columns for data as needed depending on the entry types, and
shows the links between entries in different columns. In SocioJigsaw, we have
four columns corresponding to Employee, Handler, Middleman, and Leader, and the
program filters out people whose number of contacts does not match the range
specified for that role. Each entry shows the contact's name and city of
residence. On the left side of each entry, a small horizontal bar graph
represents its number of contacts. Through a right, single, or double click,
the program can highlight a specific entry's contacts in selected columns and sort
entries based on the number of contacts, alphabetical order, or whether the
entry is highlighted. SocioJigsaw also features the means to relax the
specification of each role in terms of the number of contacts. Since the
possible number of contacts overlaps between Employee and Handler, the user can
move entries between those two columns. For more details about using Jigsaw,
please refer to http://www.cc.gatech.edu/gvu/ii/jigsaw/.
Video:
ANSWERS:
MC2.1: Which of the two social structures, A or B, most
closely match the scenario you have identified in the data?
A
MC2.2: Provide the social network structure you
have identified as a tab delimitated file. It should contain the employee, one
or more handler, any middle folks, and the localized leader with their
international contacts. What are the Flitter names of the persons involved?
Please identify only key connections (not all single links for example) as well
as any other nodes related to the scenario (if any) you may have discovered
that were not described in the two scenarios A and B above.
MC2.3: Characterize the difference between your social
network and the closest social structure you selected (A or B). If you include
extra nodes please explain how they fit in to your scenario or analysis.
First, we tested Structure A. We began with the 10 potential Employees, who
have 40 or 41 contacts. We then looked at their potential Handlers, that is,
the contacts of those Employees who themselves have between 30 and 40 contacts,
and tried to find Employees with at least three potential Handlers. We did this
by selecting all Employees (ctrl+clicking the first and the last entries), then
right-clicking and selecting "Expand Handlers", which produces Figure 1. As we
can immediately see, @schaffter and @supornpaibul each have three connections
to potential Handlers. This demonstrates one advantage of SocioJigsaw: we don't
need to examine the full social network of each of the 10 possible Employees.
Instead, we can look at only the links from multiple Employees to potential
Handlers.
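The batch filtering step described above can be sketched in a few lines of Python. This is only an illustration, not SocioJigsaw's implementation: the function names, the toy names, and the toy contact counts are hypothetical, and only the ranges (40 to 41 contacts for Employees, 30 to 40 for Handlers, at least three Handlers) come from the text. As in the tool's columns, anyone already listed as a potential Employee is left out of the Handler set.

```python
from typing import Dict, Iterable, Set, Tuple

Bounds = Tuple[int, int]


def build_links(edges: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Turn an undirected edge list into a symmetric adjacency map."""
    links: Dict[str, Set[str]] = {}
    for a, b in edges:
        links.setdefault(a, set()).add(b)
        links.setdefault(b, set()).add(a)
    return links


def in_range(count: int, bounds: Bounds) -> bool:
    lo, hi = bounds
    return lo <= count <= hi


def employees_with_handlers(links, degree, employee_range, handler_range,
                            min_handlers=3):
    """Map each potential Employee to its potential Handlers, keeping only
    Employees with at least `min_handlers` of them."""
    employees = {p for p, n in degree.items() if in_range(n, employee_range)}
    shortlist = {}
    for employee in sorted(employees):
        handlers = {
            c for c in links.get(employee, set())
            # People in the Employee column are not shown as Handlers.
            if c not in employees and in_range(degree[c], handler_range)
        }
        if len(handlers) >= min_handlers:
            shortlist[employee] = handlers
    return shortlist


# Hypothetical toy data: `degree` is each person's total contact count,
# `links` holds only the few edges needed for the illustration.
degree = {"emp_a": 40, "emp_b": 41, "h1": 35, "h2": 32, "h3": 38,
          "h4": 31, "x": 12}
links = build_links([
    ("emp_a", "h1"), ("emp_a", "h2"), ("emp_a", "h3"), ("emp_a", "x"),
    ("emp_b", "h1"), ("emp_b", "h4"), ("emp_b", "x"),
])

shortlist = employees_with_handlers(links, degree, (40, 41), (30, 40))
# Only emp_a has three potential Handlers; emp_b has two.
print(sorted(shortlist))
```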
We looked at @schaffter first. We highlight him and his connections to
possible Handlers by selecting him and choosing to expand Handlers again. We
then move these Handlers to the top by clicking the fourth icon above the column
(the one with a green arrow pointing upwards). By selecting all of these Handlers,
expanding them, and then sorting by the number of contacts, we
get the potential Middlemen, as shown in Figure 2. In the Handler
column, however, the first two highlighted entries now have a
"->" next to them. This indicates that those entries are linked to
other people identified as potential Handlers. This is a warning sign, since we know
that the Handlers should not be connected to each other. However, the arrows
point to potential Handlers who are not among @schaffter's Handlers, so we know
that his three Handlers are not connected to each other. From the Middleman
column, we can now see that all three Handlers are connected to @good, but not
to any other common entry, so @good can be viewed as the Middleman.
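The two checks made in this step (the Handlers must not know each other, and they must share a Middleman) can be expressed as a short sketch. The helper names and the toy network are hypothetical; the real Flitter data and any contact-count range for Middlemen are not reproduced here.

```python
from itertools import combinations


def build_links(edges):
    """Turn an undirected edge list into a symmetric adjacency map."""
    links = {}
    for a, b in edges:
        links.setdefault(a, set()).add(b)
        links.setdefault(b, set()).add(a)
    return links


def handlers_are_independent(handlers, links):
    """True if no two of the given Handlers are linked to each other."""
    return all(b not in links.get(a, set())
               for a, b in combinations(sorted(handlers), 2))


def common_contacts(handlers, links, exclude=()):
    """Contacts shared by every Handler, minus excluded people and the
    Handlers themselves. These are the candidate Middlemen."""
    handlers = list(handlers)
    if not handlers:
        return set()
    shared = set.intersection(*(set(links.get(h, set())) for h in handlers))
    return shared - set(exclude) - set(handlers)


# Hypothetical toy network: three Handlers of one Employee. "h1" is linked
# to another potential Handler ("h9") who is NOT one of this Employee's
# Handlers, which corresponds to the "->" marker case in the text.
links = build_links([
    ("emp", "h1"), ("emp", "h2"), ("emp", "h3"),
    ("h1", "mid"), ("h2", "mid"), ("h3", "mid"),
    ("h1", "h9"), ("h3", "other"),
])
handlers = {"h1", "h2", "h3"}

independent = handlers_are_independent(handlers, links)
middlemen = common_contacts(handlers, links, exclude={"emp"})
print(independent, middlemen)
```

In this toy case the link from "h1" to "h9" does not violate independence, because "h9" is outside the Employee's own Handler set, and "mid" is the only contact all three Handlers share.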
We go further by selecting @good and choosing "Expand Leaders" to
obtain a single entry in the Leader column, which is @szemeredi with 256
Flitter contacts. Now we have a completed structure that matches Structure A.
If you repeat these steps for the other potential Employee, @supornpaibul, you
find that his network doesn't match Structure A because his Handlers do not
have a common Middleman.
However, this batch approach, which investigates multiple nodes in each role at
once, has a weakness: because the expected numbers of contacts for the Employee
and the Handler overlap, a potential Employee who is also a possible Handler
is omitted from the Handler column. To get around this, instead of selecting
all the possible Employees and expanding their possible Handlers at once, we
clicked each of the possible Employees one at a time. This added @lafouge to
our list of potential Employees. Figure 4 shows this case. At a glance, it
seems @lafouge has only two connections to potential Handlers. However, he has
another connection to a potential Employee, @krintz, who has exactly 40
contacts, making @krintz a possible Handler as well. To explore @lafouge as an
Employee further, SocioJigsaw lets us manually move entries between the
Employee and Handler columns. After moving @krintz to the Handler column and
expanding those three Handlers' Middlemen, we obtain Figure 5. We can see that
none of the Middlemen has a connection to all three of the Handlers, so this
case does not fit Structure A.
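The effect of the overlapping ranges, and of moving an entry between columns, can be illustrated with a small sketch. The column model, the function names, and the toy people are assumptions made for illustration; only the ranges (40 to 41 and 30 to 40) and the rule that the Employee column takes precedence in the overlap come from the text.

```python
def assign_columns(degree, employee_range, handler_range):
    """Place each person in one column by contact count. Someone who fits
    both ranges lands in the Employee column, as described in the text."""
    columns = {"Employee": set(), "Handler": set()}
    for person, count in degree.items():
        if employee_range[0] <= count <= employee_range[1]:
            columns["Employee"].add(person)
        elif handler_range[0] <= count <= handler_range[1]:
            columns["Handler"].add(person)
    return columns


def move_entry(columns, person, src, dst):
    """Manually move one entry from column `src` to column `dst`."""
    if person not in columns[src]:
        raise ValueError(f"{person} is not in the {src} column")
    columns[src].remove(person)
    columns[dst].add(person)


def handlers_of(employee, links, columns):
    """Contacts of `employee` currently shown in the Handler column."""
    return links.get(employee, set()) & columns["Handler"]


# Hypothetical toy data: "peer" has exactly 40 contacts, so it fits both roles.
degree = {"emp_c": 41, "peer": 40, "h5": 33, "h6": 36}
links = {"emp_c": {"peer", "h5", "h6"}}

columns = assign_columns(degree, (40, 41), (30, 40))
before = handlers_of("emp_c", links, columns)   # "peer" is hidden here
move_entry(columns, "peer", "Employee", "Handler")
after = handlers_of("emp_c", links, columns)    # now three Handlers
print(len(before), len(after))
```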
We used a similar approach for Structure B. Among the potential Employees, we
can start with @supornpaibul, who failed to fit Structure A at the Middleman
stage. This time, we can go on to look for the Leader, since the Handlers don't
need to share a common Middleman. However, when we expand the Leaders of the
Middlemen, no three Middlemen share a common Leader, so @supornpaibul's network
doesn't match Structure B. The same discrepancy occurs when looking at
@lafouge as the potential Employee.
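The Structure B test can be sketched as a search for a Leader reachable from every Handler through that Handler's own Middleman. This is one reading of the check described above, not the tool's code: the requirement that the Middlemen be distinct, the function name, and the toy maps are assumptions.

```python
from itertools import product


def structure_b_leaders(handler_middlemen, middleman_leaders):
    """Return {leader: middlemen} for every Leader that is linked to one
    distinct Middleman per Handler. An empty result means no match.

    handler_middlemen: Handler -> set of candidate Middlemen
    middleman_leaders: Middleman -> set of candidate Leaders
    """
    handlers = sorted(handler_middlemen)
    if not handlers:
        return {}
    found = {}
    # Try every way of picking one Middleman per Handler.
    for combo in product(*(sorted(handler_middlemen[h]) for h in handlers)):
        if len(set(combo)) != len(combo):
            continue  # assumed: each Handler needs its own Middleman
        shared = set.intersection(
            *(set(middleman_leaders.get(m, set())) for m in combo))
        for leader in shared:
            found.setdefault(leader, combo)
    return found


# Hypothetical toy data.
handler_middlemen = {"h1": {"m1", "m2"}, "h2": {"m2"}, "h3": {"m3"}}
matching = structure_b_leaders(
    handler_middlemen, {"m1": {"L"}, "m2": {"L", "K"}, "m3": {"L"}})
no_match = structure_b_leaders(
    handler_middlemen, {"m1": {"L"}, "m2": {"L", "K"}, "m3": {"K"}})
print(matching, no_match)
```

The second call mirrors the outcome reported for @supornpaibul and @lafouge: Middlemen exist for each Handler, but no single Leader is common to three of them.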
When we relax the number of contacts required for each role, no new solutions
appear for Structure A unless we relax the conditions considerably. For
Structure B, however, a small amount of relaxation already produces some
solutions, and the number of solutions grows gradually as the requirements are
relaxed further. This means the original solution found for Structure A is an
outlier in the data, making it more likely to be the criminal network, while
Structure B, once the requirements are relaxed somewhat, is easily found in the
networks of ordinary people. Relaxing the conditions also meant that
SocioJigsaw had to deal with more data, but it handled the data set well
because its batch-mode visualizations show all the candidates' links in one
step.
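The relaxation experiment amounts to widening each role's range step by step and counting the matches at each step. The sketch below does this for the first stage only (an Employee with at least three potential Handlers), which is a simplification of the full structure test; the `slack` parameter, the function names, and the toy data are hypothetical.

```python
def build_links(edges):
    """Turn an undirected edge list into a symmetric adjacency map."""
    links = {}
    for a, b in edges:
        links.setdefault(a, set()).add(b)
        links.setdefault(b, set()).add(a)
    return links


def widen(bounds, slack):
    """Relax a (lo, hi) range by `slack` contacts on each side."""
    lo, hi = bounds
    return (max(0, lo - slack), hi + slack)


def count_first_stage(links, degree, employee_range, handler_range,
                      min_handlers=3):
    """Count people in the Employee range with enough in-range contacts."""
    count = 0
    for person, n in degree.items():
        if not (employee_range[0] <= n <= employee_range[1]):
            continue
        handlers = [c for c in links.get(person, set())
                    if handler_range[0] <= degree[c] <= handler_range[1]]
        if len(handlers) >= min_handlers:
            count += 1
    return count


def sweep(links, degree, employee_range, handler_range, max_slack):
    """Number of first-stage matches for each slack from 0 to max_slack."""
    return {
        s: count_first_stage(links, degree,
                             widen(employee_range, s),
                             widen(handler_range, s))
        for s in range(max_slack + 1)
    }


# Hypothetical toy data: "e1" matches with no relaxation, "e2" only at slack 2.
degree = {"e1": 40, "a": 35, "b": 32, "c": 38,
          "e2": 43, "d": 31, "f": 33, "g": 42}
links = build_links([("e1", "a"), ("e1", "b"), ("e1", "c"),
                     ("e2", "d"), ("e2", "f"), ("e2", "g")])

counts = sweep(links, degree, (40, 41), (30, 40), max_slack=2)
print(counts)
```

A structure whose count stays flat as the slack grows behaves like the outlier described above, while one whose count rises quickly is common in the data.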
MC2.4: How is your hypothesis about the social
structure in Part 1 supported by the city locations of Flovania? What part(s),
if any, did the role of geographical information play in the social network of
part one?
When we ctrl+click to select all the entries found in MC2.2, GeoView displays
the selected entries as shown in Figure 6. Each role is encoded using a
different color. The Employee and Handlers live in Prounov, a large city, and
the Middleman lives in Kannvic, a smaller city close to Prounov. This matches
the expected geospatial clues. However, the Leader is in a mid-sized city,
which does not match the given geospatial information. When only the Leader is
selected, GeoView shows that entry's number of contacts across cities, as shown
in Figure 7. We can also see that most of his contacts are in Koul and Prounov,
and that he has 14 international contacts in Otello, Transpasko, and Tulamuk.
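The per-city breakdown that GeoView shows for a selected entry is essentially a tally of that person's contacts by city. A minimal sketch, with hypothetical function names, placeholder people, and placeholder city names rather than the actual Flitter data:

```python
from collections import Counter


def contacts_by_city(person, links, city_of):
    """Count a person's contacts per city."""
    return Counter(city_of[c] for c in links.get(person, ()))


def count_in(cities, tally):
    """Total contacts that fall in the given set of cities."""
    return sum(tally[city] for city in cities)


# Hypothetical toy data; "AbroadX" and "AbroadY" stand in for foreign cities.
city_of = {"c1": "HomeCity", "c2": "HomeCity", "c3": "NearCity",
           "c4": "AbroadX", "c5": "AbroadY"}
links = {"leader": {"c1", "c2", "c3", "c4", "c5"}}

tally = contacts_by_city("leader", links, city_of)
international = count_in({"AbroadX", "AbroadY"}, tally)
print(tally.most_common(1), international)
```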
MC2.5: In general, how are the Flitter users dispersed throughout the cities of this challenge? Which of the surrounding countries may have ties to this criminal operation? Why might some be of more significant concern than others? (150)
In general, more Flitter users can be found in larger cities. To show the
distribution of contacts between cities, GeoView responds to the mouse
pointer, as shown in Figures 8 and 9. In an ordinary social network, the
number of contacts between two cities should decrease as the distance between
them increases, and this holds in most cases. Instances that conflict with
this pattern form an abnormal network and are possibly part of the criminal
organization.
To see whether one of the foreign cities might be of more concern than the
others, we focused on the distribution of the potential Leaders' contacts. We
sorted the potential Leaders by number of contacts and examined each Leader's
distribution over the cities to see whether one city had significantly more
contacts, as in Figure 7. However, the distribution seemed fairly even, so we
couldn't draw any firm conclusions.
Figures (resized)
Figure 1
Figure 2
Figure 3
Figure 4
Figure 5
Figure 6
Figure 7
Figure 8
Figure 9
Due to a submission error, Figure 9 could not be included here. Please refer to
the link below for Figure 9.
Figure links (original size)